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Abstract 

Abstract: - We study dynamics of spread of epidemics of SIR type in a realistic spatially-explicit 
geographical region, Southern and Central Ontario, using census data obtained from Statistics 
Canada, and examine the role of population mixing in epidemic processes. Our model incorporates 
the random nature of disease transmission, the discreteness and heterogeneity of distribution of 
host population. We find that introduction of a long-range interaction destroys spatial correlations 
very easily if neighbourhood sizes are homogeneous. For inhomogeneous neighbourhoods, very 
strong long-range coupling is required to achieve a similar effect. Our work applies to the spread 
of influenza during a single season. 
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I. INTRODUCTION 



Studies of some epidemics, for example, the spread of the Black Death in Europe from 
1347-1350 111, the past influenza pandemics spread of fox rabies in Europe, or spread 
of rabies among raccoons in eastern United States and Canada j^, indicate that host and 
infective interactions and spatial distributions of their populations should play an important 
role in the dynamics of spread of many infectious diseases. 

Until recently most mathematical models of spread of epidemics have described interac- 
tions of large number of individuals in aggregate form and often these models have neglected 
aspects of spatial distribution of populations, importance of which have been addressed in 
0]. Adopting methodologies like cellular automata, coupled map lattices, lattice gas cellular 
automata or age nt based simulations, new classes of models have been proposed and studied 



to incorporate with various levels of abstraction and details: direct interac- 
tions among individuals; spatial distribution of population types (i.e., infective, susceptible, 
removed); individuals' movement; effects of social networks on spread of epidemics. 

The goal of our work is to study the effects of population interactions and mixing on the 
spatio-temporal dynamics of spread of epidemics of SIR (susceptible-infected-removed) type 
in a realistic population distribution. For the purpose of our study we developed a fully 
discrete individually-based simulation model that incorporates the random nature of disease 
transmission. The key feature of this model is the fact that for each individual the set of all 
individuals with whom he/she interacts may change with time. This results in time varying 
small world network structure. 

As a case study we considerder census data obtained from Statistic Canada for 
Southern and Central Ontario. The data set specifies population of small areas composed of 
one or more neighbouring street blocks, called "dissemination areas" . Using these data, we 
study the effects of two types of interactions among individuals on the spread of epidemics. 
The first type of interaction is the one among individuals located only in adjacent dissemi- 
nation areas. The second type of interaction is the one among individuals who in addition 
to being in contact with members of their own and adjacent dissemination areas may also 
be in contact with individuals located in remote, non adjacent, dissemination areas. This 
last case can be seen as a case of "short-cuts" among multiple far away dissemination ar- 
eas. We investigate spatial correlations in our model and how they can be destroyed by the 
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"short-cuts" in population contacts. Additionally, we derive a mean field description of our 
individually based simulation model and compare the results of the two models. 

The presented work is continuation and expansion of our work in P, 0, and contributes 
to better understanding of spread of epidemics of SIR type, including influenza. 

II. SIR EPIDEMIC MODEL DESCRIPTION 

In order to study how population interactions ("population mixing") affects the spread of 



epidemics we construct an individual 
by a particle, as in our earlier work 



y ba 



Dased model in which each individual is represented 
Models of this type take various forms, ranging 



from stochastic interacting particle systems m to models based on cellular automata or 

pnrin 

coupled map lattices |7l 1131. 114 115|. 



In our model, we consider a set of individuals, labelled with consecutive integers 
1,2 . . . , N. This set of labels is denoted by C. We assume that each individual, at any given 
time can be in one of three, mutually exclusive, distinct states: susceptible (S), infected (I) 
or removed (R). An individual can change state only in two ways: a susceptible individual 
who comes in direct contact with an infected individual can become infected with probability 
p; an infected individual can become removed with probability q. The precise description of 
the model is as follows. 

The state of the i-th individual at the time step k is described by a Boolean vector 
variable ri{i,k) = {ris{i,k),rij{i,k),r]ji{i,k)), where r)T-{i,k) = 1 if the z-th individual is in 
the state r, where r G {S, I, R}, and ?7^(z, k) = otherwise. Thus, 

Vsih k) © r]i{i, k) © riii{i, k) = 1. 

We assume that i = 1,2, ... ,N and A; G N, that is, the time is discrete. Hence, the only 
allowed values of the vector r]{i,k) are 



r]{i,k) 



(1,0,0), if the i -th individual's state G S 
(0,1,0), if the i -th individual's state G / 

(0, 0, 1), if the i -th individual's state G R 



No other values of r]{i, k) are possible in SIR epidemic model. In SIR model an individual can 
only be at any give time and transitions occur only from susceptible to infected 



3 



and from infected to removed. A removed individual does not become susceptible or infected 
again in SIR model. Therefore, SIR model is suitable for studying spread of influenza in the 
same season because the same type of influenza virus can infect an individual only once and 
once the individual is recovered from the flu it becomes immune to this type of virus. 

We further assume that at time step k the i-th individual can interact with individuals 
from a subset of C, to be denoted by C{i,k). Using this notation, after one time iteration 
the rir{i, k + 1) becomes 

rjs{i,k + l) = r]s{i,k) JJ (1 - ,.r/,(j, A;)), (1) 

iec(i,fc) 

r/,(z, k + 1) = r]s{t, k)ll- J] (1 - X,j,kViij, k)) + m{^, k){l - F,,^), (2) 

\ jec(i,fc) / 
rjR{i,k + l) = r]R{i,k) + r]j{i,k)Yi, (3) 

where X = {Xij^^ '■ hj = 1) ■■■,N and k = 1,2, ....} is a sequence of iid Boolean random 
variables such that Pr{Xi^j^k = 1) = Pi^{Xi j k = 0) = 1 — p, and Y = {Yi ^ : i = 1, A^, 
and k = 1,2, ...} is a sequence of iid Boolean variables such that Pr{Yi = 1) = q, Pr{Yi = 
0) = 1 — g. We assume that the sequences X and Y of random variables are independent of 
each other and of the random variables r]T-{i, k). 

Observe that, if the z-th individual interacts with an infected j-th individual at time 
step k and Xijj. = 1, then the infection is transmitted from the j-th individual to the i-th 
individual at this time step. Thus, if some product Xij krij{j, k) takes the value 1, then 
ris{i, k + 1) = 0, meaning that the i-th individual has changed its state from susceptible to 
infected. 

The key feature of this model is the set C{i, k), representing all individuals with whom the 
i-th individual may have interacted at time step k. In a large human population, it is almost 
impossible to know C{i,k) for each individual, so we make some simplifying assumptions. 
First of all, it is clear that the spatial distribution of individuals must be reflected in the 
structure of C(i,k). We have decided to use realistic population distribution for Southern 

nn 

and Central Ontario using census data obtained from Statistic Canada . The selected 

region is mostly surrounded by waters of Great Lakes, forming natural boundary conditions. 
The data set specifies population of so called "dissemination areas" , that is, small areas 
composed of one or more neighbouring street blocks. We had access to longitude and latitude 
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data with accuracy of roughly 0.01°, hence some dissemination areas in densely populated 
regions have the same geographical coordinates. We combined these dissemination areas 
into larger units, to be called "modified dissemination areas" (MDA). 

We now define the set C{i,k) using the concept of MDAs. This set is characterized by 
two positive integers and nj. Let us label all MDAs in the region we are considering by 
integers m = 1,2,..., M, where in our case M = 5069. For the z-th individual belonging 
to the m-th MDA, the set C{i,k) consists of all individuals belonging to the m-th MDA 
plus all individuals belonging to the ric MDAs nearest to the m-th MDA and the Uf MDAs 
randomly selected among all remaining MDAs. While the "close neighbours", that is, the 
ric nearest MDAs, will not change with time, the "far neighbours", that is, the nf randomly 
selected MDAs, will be randomly reselected at each time step. 

III. DERIVATION OF MEAN FIELD EQUATIONS 

The model described in the previous section involves strong spatial coupling between 
individuals. Before we describe consequences of this fact, we first construct a set of equations 
which approximate dynamics of the model under the assumption of "perfect mixing", in 
other words, neglecting the spatial coupling. 

The state of the system described by eq. at time step k is determined by the states of 

all individuals and is described by the Boolean random field r]{k) = {r]{i, k) : i = 0, . . . , N}. 
Under the assumptions of our model, the Boolean field {T]{k) : i = 0, 1, 2 . . .} is a Markov 
stochastic process. 

By taking the expectation E^^o) of this Markov stochastic process when the initial config- 
uration is 77(0) we get the probabilities of the z-th individual being susceptible, or infected, 
or removed at time k, that is, Pr{i, k) = i?T7(o) [Vrih k)] for r G {S, I, R}. 

Since the sequences of random variables X and Y are independent of each other and 
of the sequences of the random variables rj^{i,k), assuming additionally independence of 
random variables rj^{i., k), the expected value of a product of these variables is equal to the 
product of expected values. Under these mean field assumptions, taking expected values of 
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both sides of equations we obtain 

ps{i,k + l) = ps{i,k) Yl {l-ppiU,k)), (4) 

jec{i,k) 

pj{i,k + l) = ps{i,k)[l- H {l-ppj{j,k))^+pj{z,k){l-q), (5) 

jec{i,k) 

Pnihk + l) = pR{i,k) + pi{i,k)q. (6) 

Since mean field approximations neglect spatial correlations, we further assume that Prii, k) 
is independent of i, that is pr{i, k) = Prik). Even though sets C(z, k) have different number 
of elements for different i and k, for the purpose of this approximate derivation we assume 
that they all have the same number of elements (1 + ric + nf)D, where D is the average 



MDA population. All these assumptions lead to 

Ps{k + l)=ps{k){l-ppj{k)Y'-^-^^^^f^^, (7) 

pi{k + 1) = pj{k) + ps{k) - ps{k)il - ppi{k)f'+''^^-f^^ - qpj{k), (8) 

PR{k + l) = pR{k) + qpi{k). (9) 



The third equation in the above set is obviously redundant, since ps{k) + pi{k) + pR^k) = 1. 

Similarly to the classical Kermack-McKendrick model, mean field equations (jZl)-© ex- 
hibit a threshold phenomenon. Depending on the choice of parameters, we can have 
Pi{k) < p/(0) for all k, meaning that the infection is not growing and eventually it will 
die out because in our model no new individuals are being born or arrive from outside 
the area under consideration during the time of the epidemic. Alternatively, we can have 
Pi{k) > pi{0) for some k, meaning that the epidemic is spreading. The intermediate scenario 
of constant pi{k) will occur when pi{k) = p/(0), that is, when 

Ps{0) - psmi - m(0))(i+"^+"/)^ - qpj{0) = 0. (10) 

Assuming that initially the entire population consists only of susceptible and infective in- 
dividuals, that is, there are no individuals in the removed group at A; = 0,we have ps{0) = 
1 — p/(0). Furthermore, if (1 + ric + rif)D is large, we can assume (1 — ppi{0)Y^~^^''~^'^f^^ ~ 
1 — p{l + ric + nf)Dpj{0). Solving eq. (jlUp for q under these assumptions we obtain 

q= (l-p,(0))(l+^c + ri/)Dp. (11) 

Thus, assuming the mean field approximation the epidemic can occur only if q < (^^ ^ 
pi{Q)yi + n^ + nf)Dp. 

6 



IV. SPATIO-TEMPORAL DYNAMICS OF SIR EPIDEMIC MODEL 



The mean-field equations derived in the previous section depend only on the sum of ric 
and Uf. This means, for example, that the model with ric = 12, Uf = and the model with 
ric = 11, Uf = 1 will have the same mean field equations. However, the actual dynamics in 
these two cases are very different, see Figure |21 and Figure lU Depending on the relative size 
of Uf and Uc, the epidemic may propagate or die out, as the following analysis shows. In 
order to make the subsequent analysis more convenient, we introduce parameter 7, defined 
as 



Let Nrik) be the expected value of the total number of individuals belonging to class r G 
{S, I, R}, that is. 



We say that an epidemic occurs if there exists A; > such that Nj{k) > Ni(0). For fixed 
p, Uf and Uc, there exists a threshold value of q to be denoted by Qc, such that for each 
q < Qc an epidemic occurs, and for q > it does not occur. Obviously qc depends on p, 
and this is illustrated in Figure Q which shows graphs of qc as a function of p for several 
different values of 7, where Uf + ric = 12. The graphs were obtained numerically by direct 
computer simulations of the model. The condition Uf + Uc = 12 means that the size of the 
neighbourhood is kept constant, but the proportion of "far neighbours" (represented by 7) 
varies. Figure Q also shows the mean-field line given by eq. (jllj) . 

We observe that the parameter 7 controls dynamics of the epidemic process in a significant 
way, shifting the critical line up or down. When 7 = 0, that is, when there are no interactions 
with "far neighbours" , the epidemic process has a strictly local nature, and we can observe 
well defined epidemic fronts propagating in space, regardless at which MDA the epidemic 
starts at A; = 0. This is illustrated in Figure |21 where the epidemic starts at a single 
centrally located MDA with low population density (Figure EK) and on Figure El where the 
epidemic starts in a MDA with high population density (Figure OK). The simulations were 
done for the same parameters in both cases except for the different locations of the onsets of 
epidemics. The figures display MDAs that are represented by pixels colored according to the 
density of individuals of a given type. The red component of the color represents density of 
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nc + Uf 



(12) 
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FIG. 1: Graphs of critical lines for 7 = 0.42,0.75,0.92, and 1.0. The first line from the bottom 
represents mean field approximation. 

infected individuals, green density of susceptibles, and blue density of removed individuals. 
By density we mean the number of individuals of a given type divided by the size of the 
population of the MDA. The epidemic waves propagating outwards can be clearly seen on 
Figure El and Figure El in the successive snapshots (b), (c) and (d). The fronts are mostly 
red. This means that the bulk of infected individuals is located at the fronts. After these 
individuals gradually recover the centers become blue. 

Let us now consider slightly modified parameters, taking 7 = 3^- This means that we now 
replace one "close" MDA by one "far" MDA. This does not seem to be a significant change, 
yet the effect of this change is truly noticeable. As we can see in Figure El the epidemic 
propagates much faster, and there are no visible fronts. The disease quickly spreads over 
the entire region and large metropolitan areas become red in a short time, as shown in 
Figure |3fb). This suggests that infected individuals are more likely to be found in densely 
populated regions, and their distribution is dictated by the population distribution - unlike 
in FigureElor FigureEl where infected individuals are to be found mainly at the propagating 
front. 
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FIG. 2: Example of a propagating epidemic front for 7 = 0, p = 0.00005, q = 0.05, with (a) 
k = 0, (b) k = 40, (c) k = 60 and (d) k = 80. The initial outbreak is located in an area with low 
population density. Modified dissemination areas are represented by pixels colored according to 
density of individuals of a given type, such that the red component represents density of infected, 
green density of susceptibles, and blue density of removed individuals. 

V. SPATIAL CORRELATIONS OF SIR EPIDEMIC MODEL 

In order to quantify the observations of the previous section, we use a spatial correlation 
function for densities of infected individuals defined as 

h{r,k) = (??/(^,A;)^//(J,/^))^<d(i,^■)<^+Ar•' 

where d{i,j) is the distance between i-th and j-th individual, and < • > represents averaging 
over all pairs i, j satisfying condition r < d{i,j) < r + Ar. In the following considerations 
we take Ar = 1 km. The distance between two individuals is defined as the distance between 
MDAs to which they belong. 

Consider now a specific example of the epidemic process described by eq. (001), where 
p = 0.000015, q = 0.2, and Uc+Uf = 12. For this choice of parameters epidemics always occur 



FIG. 3: Example of a propagating epidemic front parameters identical as in Figure 2, except that 
the initial outbreak is now located in an area with high population density. Color coding like in 
the previous figure. 

as long as 7 > 0. Figure El shows graphs of the correlation functions h{r, kmax) at the peak 
of each epidemic, so that kmax is the time step at which the number of infected individuals 
achieves its maximum value. An interesting phenomenon can be observed in the figure under 
consideration: while the increase of the proportion of "far" neighbours does destroy spatial 
correlations, one needs very high proportion of "far" neighbours to make the correlation curve 
completely flat. In it is reported that for influenza epidemics h{r,kmax) ~ f^-O'^iom ^ 
we fit h{r, k^ax) = Cr"' curve to the correlation data shown in Figure we obtain values of 
the exponent a as shown in Figure IHl In order to obtain a of comparably small magnitude 
as reported in Q|, one would have to take 7 equal to at least 0.83, meaning that vast 
majority of neighbours would have to be "far neighbours". In reality, this would require 
that the vast majority of all individuals one interacted with were not his/her neighbours, 
coworkers, etc., but individuals from randomly selected and possibly remote geographical 
regions. This is clearly at odds with our intuition regarding social interactions, especially 
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FIG. 4: Development of the epidemic for 7 = ^, p = 0.00005, q = 0.05, with (a) A; = 0, (b) A; = 15, 
(c) A: = 30 and (d) k = 60. Colour coding is the same as in the previous figure. 

outside large metropolitan areas. This prompted us to investigate further and to find out 
what is responsible for this effect. 

Upon closer examination of spatial patterns generated in simulations of our individually- 
based model, we reach the conclusion that the inhomogeneity of population sizes in neigh- 
bourhoods C{i,k) makes spatial correlations so persistent. Since different MDAs have dif- 
ferent population sizes, we expect that some individuals will have larger neighbourhood 
populations than others, and as a result they will be more likely to get infected, even if the 
proportion of infected individuals is the same in all MDAs. This will build up clusters of 
infected individuals around populous MDAs. 

To test if this is indeed the factor responsible for strong spatial correlations in our model, 
we replaced all MDA population sizes with a constant population size D, that is, the average 
MDA population size. As expected, graphs of the correlation functions obtained in this case 
were are all essentially flat, with the exponent a close to zero even in the case of nj = 1, 
when we obtained a = 0.023 ± 0.002. 
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FIG. 5: Graphs of the correlation function h{r, kmax) for different values of 7, where p = 0.000015, 
q = 0.2, and nc + rif = 12. 
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FIG. 6: Values of the exponent a for different values of the parameter 7. The exponent has been 
obtained by fitting h{r, kmax) = Cr°' to simulation data. 

VI. CONCLUSIONS 

We conclude that spatial correlations are difficult to destroy if neighbourhood sizes are 
inhomogeneous. Very significant amount of long-range interactions (i.e., very strong mixings) 
is required to obtain fiat correlations curves. However, for homogeneous neighbourhood sizes, 
even relatively small long-range interaction immediately forces the process into the perfect- 
mixing regime, resulting in the lack of spatial correlations. The model that we developed 
can be applied to other realistic population distributions and geographical regions. 
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